Aim To type a set of 194 US African American, Caucasian,
and Hispanic samples (self-declared ancestry) for 40
autosomal single nucleotide polymorphism (SNP) markers
intended for human identification purposes.

Methods Genotyping was performed on an automated 
commercial electrospray ionization time-of-flight mass 
spectrometer, the PLEX-ID. The 40 SNP markers were am- 
plified in eight unique 5plex PCRs, desalted, and resolved 
based on amplicon mass. For each of the three US sample 
groups statistical analyses were performed on the result- 
ing genotypes. 

Results The assay was found to be robust and capable of 
genotyping the 40 SNP markers consuming approximately 
4 nanograms of template per sample. The combined random
match probabilities for the 40 SNP assay ranged from
10^-16 to 10^-21.

Conclusion The multiplex PLEX-ID SNP-40 assay is the first 
fully automated genotyping method capable of typing 
a panel of 40 forensically relevant autosomal SNP mark- 
ers on a mass spectrometry platform. The data produced 
provided the first allele frequency estimates for these 40
SNPs in a National Institute of Standards and Technology 
US population sample set. No population bias was detect- 
ed although one locus deviated from its expected level of 
heterozygosity. 
The forensic community has addressed the application of
autosomal single nucleotide polymorphism (SNP) mark- 
ers for human identification (1-4). SNPs may be of utility 
when working with highly degraded DNA because they 
can be assayed with very small polymerase chain reaction
(PCR) amplicons. Over the past 10 years, various SNP
assays and candidate marker panels have been described
(5-10). One set of interest is a panel of 40 autosomal SNP
markers intended as a universal individual identification 
panel. These markers were selected for high heterozygosity
and low F_ST values in studies of 40 populations to
complement CODIS STR loci (8). Initially these markers were
screened and typed for world populations by singleplex
TaqMan-based assays. More recently, there have been
attempts to develop multiplex assays for typing the 40 SNP
panel (11). One of these is the PLEX-ID SNP-40 assay, composed
of 8 unique 5plex PCRs developed by Abbott Molecular.
The PLEX-ID instrument platform is a commercial elec- 
trospray ionization mass spectrometer capable of auto- 
mated analysis of short PCR amplicons (less than 140 bp) 
generated by proprietary assays (see SNP-40, mtDNA 2.0). 
The instrument desalts each PCR reaction through the 
use of magnetic bead chemistry and injects the desalted 
PCR reaction into the mass spectrometer. The peaks are 
separated and resolved based on time-of-flight analysis. 
With the emerging development of ultra-high throughput
sequencing applied to forensics, it will become more
commonplace to utilize these "core" SNP marker sets. Here we
report the assay performance and allele frequencies for a
subset of our National Institute of Standards and Technology
(NIST) US population samples.

METHODS 

For this study, samples (n = 194) were selected from three
population groups representative of major population seg- 
ments in the United States (African Americans = 74, Cauca- 
sians = 75, and Hispanics = 45). 

Whole blood with anonymous identifiers and self-described 
ancestry was obtained from commercial blood banks (In- 
terstate Blood Bank, Memphis, TN and Millennium Biotech, 
Fort Lauderdale, FL, USA). Blood samples were subjected to 
bulk DNA extraction using a modified salt-out procedure as 
described previously (12). DNA concentrations in extracts 
were determined using the Quantifiler Human DNA
Quantification kit (Life Technologies, Carlsbad, CA, USA) on an
Applied Biosystems model 7500 (Life Technologies) real-time
PCR instrument. Quantification values were then used
to normalize all DNA extracts to a final concentration
of 0.1 ng/µL for PCR amplification. All samples were
previously examined with 15 autosomal short tandem repeats
and the amelogenin sex-typing marker using the AmpFlSTR
Identifiler kit (Applied Biosystems, Foster City, CA, USA)
to verify that each sample was unique (13).

TABLE 1. Information for the 40 autosomal single nucleotide
polymorphism (SNP) loci examined in this study sorted by
chromosome position. Chromosome positions were determined
using the UCSC Genome Browser with the Human Feb.
2009 (GRCh37/hg19) assembly.

SNP typing 

The 40 SNP markers typed by the PLEX-ID SNP-40 assay 
were previously selected and characterized on multiple 
world populations (8). The following data were obtained 
for the 40 SNP markers typed in the assay: dbSNP reference 
SNP (rs) number, nucleotide position (according to the
Human February 2009 (GRCh37/hg19) assembly),
chromosomal band, and physical distance from adjacent markers
located on the same chromosome (Table 1). 

PCR amplification was performed as recommended by the
manufacturer by adding a total of 0.5 ng of template DNA in
a 5 µL volume to each of eight wells in a column of a
pre-fabricated SNP-40 assay plate (Abbott Molecular, Des
Plaines, IL, USA). In total, eight unique 5plex reactions were
run per sample, requiring approximately 4 ng of DNA
template per sample. Template DNA was added to each well
by using a pipette tip to pierce the foil seal covering the
well. On each 96-well plate, ten unique templates were run
in parallel with a no-template control and a positive
amplification control, 9947A DNA (Promega Corp., Madison,
WI, USA) at 0.1 ng/µL. After template addition, the PCR
plate was re-sealed using PCR Foil seals (Abbott Molecular)
on an ALPS 50V Heat Sealer (ThermoFisher Scientific,
Waltham, MA, USA) by compressing the foil seal and PCR
plate for 4 seconds at 180°C. The prepared 96-well plate
was then briefly centrifuged and placed in a Mastercycler
ProS thermal cycler (Eppendorf AG, Hamburg, Germany) for
thermal cycling with the following program: initial
denaturation at 96°C for 10 minutes; 40 cycles of
denaturation at 96°C for 20 seconds, annealing at 58.5°C
for 2 minutes, and extension at 72°C for 10 seconds;
followed by a final extension step at 72°C for 4
minutes and a 4°C hold.
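
The template and plate arithmetic described above can be checked with a short sketch. The constants come from the text; the 12-column layout is an assumption based on a standard 96-well plate with one sample or control per 8-well column, and the cycling time counts programmed holds only, because ramp times are not reported.

```python
# Template and plate arithmetic for the SNP-40 setup described in the text.
STOCK_NG_PER_UL = 0.1       # normalized extract concentration (ng/uL)
VOLUME_UL_PER_WELL = 5.0    # template volume added to each well (uL)
WELLS_PER_SAMPLE = 8        # eight 5plex reactions per sample
PLATE_COLUMNS = 12          # assumption: standard 96-well plate, 8 rows x 12 columns
CONTROLS_PER_PLATE = 2      # no-template control + positive control

ng_per_well = STOCK_NG_PER_UL * VOLUME_UL_PER_WELL
ng_per_sample = ng_per_well * WELLS_PER_SAMPLE
samples_per_plate = PLATE_COLUMNS - CONTROLS_PER_PLATE
snps_per_sample = WELLS_PER_SAMPLE * 5   # eight 5plex reactions

# Programmed hold times only; ramping between temperatures is ignored.
cycle_seconds = 20 + 2 * 60 + 10                       # denature + anneal + extend
hold_minutes = (10 * 60 + 40 * cycle_seconds + 4 * 60) / 60

print(f"{ng_per_well:.1f} ng per well, {ng_per_sample:.1f} ng per sample")
print(f"{samples_per_plate} samples per plate, {snps_per_sample} SNPs per sample")
print(f"{hold_minutes:.0f} minutes of programmed hold time")
```

The 114 minutes of hold time is a lower bound on the thermal cycling step and is compatible with the total run time reported in the Discussion.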

Following PCR amplification, the 96-well plate was briefly 
centrifuged and placed in the input stacker of the PLEX-ID 
instrument for automated desalting and mass determina- 
tion as per manufacturer's recommended procedure. 

PCR products were purified by the PLEX-ID instrument
using a proprietary magnetic bead chemistry to remove
salts, enzymes, unincorporated nucleotides, and any other
PCR components that might interfere with collection of
mass spectra. Purified PCR product was eluted in a buffer
containing two peptide standards with masses of 727.4
Da and 1347.7 Da, which act as calibrants to facilitate data
processing. The electrospray ionization source operates
in negative mode at approximately -4000 V (depending
on the individual instrument's tuning parameters, which
are not user configurable) and 300°C. PCR products were
sprayed into the ionization source at a flow rate of 280
µL per hour with dry compressed air used as a
counter-current to aid in analyte desolvation. The time-of-flight
analyzer collects 5000 scans per second for a period of
approximately 28 seconds. Masses were resolved based
on differences in the time taken to traverse the flight tube,
which depends on the mass-to-charge ratio (m/z). Resultant
mass spectra were processed by proprietary software
(IbisTrack version 2.7), which performs several steps to
produce a background-subtracted, deconvolved representation
of the mass spectral data as if only the singly charged mass
peak were detected, with mass (Daltons) on the x-axis and
signal strength (arbitrary units) on the y-axis. Successfully
detected masses were stored in a table that resides in the
IbisTrack database. The resulting mass spectra were
inspected visually in IbisTrack software, and any masses not
correctly assigned by the software were manually added
or deleted.
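
The processing steps of the proprietary software are not described, so the following is only a minimal sketch of the textbook relation that underlies charge-state deconvolution in negative mode: an ion that has lost z protons appears at m/z = (M - z * m_p) / z. The 30 000 Da strand mass and the charge states are hypothetical values chosen for illustration; the scan rate, acquisition time, and flow rate come from the text.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def mz_negative_mode(neutral_mass_da: float, charge: int) -> float:
    """m/z of an ion formed by removing `charge` protons from the neutral molecule."""
    return (neutral_mass_da - charge * PROTON_MASS_DA) / charge


def neutral_mass_from_mz(mz: float, charge: int) -> float:
    """Invert the relation above to recover the neutral mass."""
    return charge * (mz + PROTON_MASS_DA)


hypothetical_strand_mass = 30000.0  # Da, illustrative only
for z in (20, 25, 30):              # illustrative charge states
    mz = mz_negative_mode(hypothetical_strand_mass, z)
    recovered = neutral_mass_from_mz(mz, z)
    print(f"z = {z}: m/z = {mz:.3f}, recovered mass = {recovered:.1f} Da")

# Acquisition figures implied by the reported instrument settings.
total_scans = 5000 * 28                  # scans per second x seconds
sprayed_volume_ul = 280 * 28 / 3600      # uL per hour x seconds / seconds per hour
print(total_scans, round(sprayed_volume_ul, 2))
```

Every charge state of the same strand maps back to one neutral mass, which is why the deconvolved spectrum can be drawn with mass rather than m/z on the x-axis.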

Following review, genotypes for each sample group were 
exported from IbisTrack to Microsoft Excel 2010 for
formatting and further analysis with PowerMarker, version
3.25 and Arlequin software, version 3.5.1.2 (14,15). Allele 
frequencies, expected and observed heterozygosity val- 
ues, and P values (Fisher exact test for Hardy-Weinberg 
equilibrium) for each marker were calculated for the 
three US sample groups. The combined random match 
probabilities (RMP) for each sample were calculated us- 
ing Excel 2010. 
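
The formulas used by the software packages are not stated in the text. The sketch below assumes the usual definitions: allele frequencies by gene counting, the sample-size-corrected expected heterozygosity 2N/(2N - 1) * (1 - sum of squared allele frequencies), and a combined RMP formed as the product of Hardy-Weinberg genotype frequencies across independent loci. The genotype counts (14, 42, 17) are hypothetical; they were chosen only because they reproduce the rounded Table 2 values for rs10092491 in the African American group. The three-locus profile is also hypothetical.

```python
from math import prod


def allele_freqs(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Frequencies of alleles A and B by gene counting."""
    n = n_aa + n_ab + n_bb
    f_a = (2 * n_aa + n_ab) / (2 * n)
    return f_a, 1.0 - f_a


def observed_het(n_aa: int, n_ab: int, n_bb: int) -> float:
    return n_ab / (n_aa + n_ab + n_bb)


def expected_het(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Expected heterozygosity with the 2N/(2N-1) sample-size correction (assumed)."""
    n = n_aa + n_ab + n_bb
    f_a, f_b = allele_freqs(n_aa, n_ab, n_bb)
    return (2 * n / (2 * n - 1)) * (1.0 - f_a**2 - f_b**2)


def genotype_freq(genotype: str, f_a: float) -> float:
    """Hardy-Weinberg genotype frequency for a biallelic locus."""
    f_b = 1.0 - f_a
    return {"AA": f_a**2, "AB": 2 * f_a * f_b, "BB": f_b**2}[genotype]


def combined_rmp(profile: dict[str, str], f_a_by_locus: dict[str, float]) -> float:
    """Product of genotype frequencies over loci, assuming independence."""
    return prod(genotype_freq(g, f_a_by_locus[locus]) for locus, g in profile.items())


counts = (14, 42, 17)  # hypothetical AA, AB, BB counts
f_a, f_b = allele_freqs(*counts)
print(round(f_a, 3), round(f_b, 3))          # 0.479 0.521
print(round(observed_het(*counts), 3))       # 0.575
print(round(expected_het(*counts), 3))       # 0.503

# fA values as listed for the African American group in Table 2; profile is hypothetical.
f_a_by_locus = {"rs10092491": 0.479, "rs1019029": 0.459, "rs13134862": 0.426}
profile = {"rs10092491": "AB", "rs1019029": "AA", "rs13134862": "BB"}
print(f"{combined_rmp(profile, f_a_by_locus):.3e}")
```

Because each biallelic genotype frequency is at most 0.5 at these allele frequencies, multiplying across 40 loci is what drives the combined RMP down to the very small values reported.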

RESULTS 

Figure 1 illustrates an example mass spectrum obtained
from this study. Each spectrum contains the products of
a 5plex PCR. Four signal peaks are typically observed for
heterozygous loci, corresponding to two forward and two
reverse strands of DNA (see markers rs2272998 and rs445251
in Figure 1). Only two signal peaks are observed for
homozygous loci (see markers rs6591147, rs321198, and
rs3780962). Of the 194 samples examined in this study,
incomplete or partial genotypes were observed for 21 loci
(21/7760 = 0.27%). Ten of the failures were due to data
not transferring to the PLEX-ID server because of a
communication error. The remaining incomplete genotypes
coincided with amplification reactions that exhibited poor
signal over the entire 5plex. This may have been due to
inefficient PCR or desalting in those specific
amplification reactions. There was no evidence of a single locus
dropping out due to underlying SNPs that would affect
PCR primer binding.

FIGURE 1. An example mass spectrum is shown (x-axis: mass in Da; annotated peaks include rs3780962, homozygote T/T, and rs445251, heterozygote C/G). This specific example contains 3 homozygous (rs6591147, rs321198, rs3780962) and
2 heterozygous (rs2272998 and rs445251) markers. The forward (F) and reverse (R) strands for each polymerase chain reaction amplicon
are highlighted. Note the mass interleaving of single nucleotide polymorphisms (SNP) rs6591147 and rs321198.

The genotype data for the 194 samples were evaluated
for the following parameters: allele frequencies, observed 
heterozygosity, expected heterozygosity, and P value (Ta- 
ble 2). The combined RMP for each sample was calculated 
based on the determined allele frequencies for the corre- 
sponding sample group (Table 3). 

DISCUSSION 

A total of 6 of the 120 tests (40 loci × 3 sample groups)
for Hardy-Weinberg equilibrium indicated a deviation
from the expected result. Three were observed in the
Caucasian sample group (rs1019029, rs1358856, and
rs6811238), 2 in the African American group (rs1523537
and rs447818), and 1 in the Hispanic group (rs13182883).
At a significance threshold of 0.05, approximately 5% of
the comparisons, or 6 out of 120, can be expected to deviate
from Hardy-Weinberg equilibrium by chance alone (16,17).
Significant values at the 95% confidence level were those
less than 0.05 (5%). The Bonferroni-corrected significance
threshold for each population was 0.05/40 = 0.00125. Using
this criterion, only the SNP marker rs1019029 would still
be considered significant.
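
The multiple-testing reasoning above can be reproduced in a few lines. This is a sketch, not the authors' procedure: the P values are the six flagged values as they appear in Table 2, assigned to sample groups as stated in the text.

```python
ALPHA = 0.05
N_LOCI = 40
N_GROUPS = 3

# Flagged P values as listed in Table 2; group assignment follows the text.
flagged = {
    ("Caucasian", "rs1019029"): 0.0001,
    ("Caucasian", "rs1358856"): 0.0222,
    ("Caucasian", "rs6811238"): 0.0211,
    ("African American", "rs1523537"): 0.0261,
    ("African American", "rs447818"): 0.0047,
    ("Hispanic", "rs13182883"): 0.0230,
}

# Number of tests expected to fall below ALPHA by chance if all loci are in equilibrium.
expected_by_chance = ALPHA * N_LOCI * N_GROUPS

# Bonferroni threshold applied within each population (40 tests per group).
bonferroni_threshold = ALPHA / N_LOCI

still_significant = [
    locus for (group, locus), p in flagged.items() if p < bonferroni_threshold
]

print(round(expected_by_chance, 2))        # 6.0
print(round(bonferroni_threshold, 5))      # 0.00125
print(still_significant)                   # ['rs1019029']
```

The observed count of six nominal deviations equals the number expected by chance, and only rs1019029 falls below the corrected threshold.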

Typically the minimum number of samples needed to
provide a robust estimate of allele frequencies for loci
containing 5 to 15 alleles is 100 to 150 samples for each
population (18). Since in this study we measured biallelic
markers that have only three possible genotypes (AA, BB,
or AB), a smaller number of samples is deemed to be
sufficient, provided that a minimum allele frequency of 5/2N
is utilized (19). An examination of the data for each sample
group did not find any frequency measurements (out of
the 360 total) below the 5/2N threshold.
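
As a worked illustration of the 5/2N rule, the sketch below computes the threshold for each group. It assumes N is the full group size given in the Methods; the effective N at a locus with missing genotypes would be slightly smaller and the threshold slightly higher. The example frequencies are the fB values listed for rs10092491 in Table 2.

```python
GROUP_SIZES = {"African American": 74, "Caucasian": 75, "Hispanic": 45}


def min_allele_frequency(n_individuals: int) -> float:
    """Minimum allele frequency 5/2N for a sample of N diploid individuals."""
    return 5 / (2 * n_individuals)


thresholds = {group: min_allele_frequency(n) for group, n in GROUP_SIZES.items()}
for group, threshold in thresholds.items():
    print(f"{group}: 5/2N = {threshold:.4f}")

# fB for rs10092491 as listed in Table 2 (AfrAmer, Cauc, Hisp).
example_freqs = {"African American": 0.521, "Caucasian": 0.433, "Hispanic": 0.400}
below = [g for g, f in example_freqs.items() if f < thresholds[g]]
print(below)  # []
```

The thresholds are roughly 0.034, 0.033, and 0.056, far below the allele frequencies of a panel selected for high heterozygosity.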

In the Caucasian sample group, the SNP marker rs1019029
exhibited a low P value (<0.0001) as well as a high ob- 
served heterozygosity of 0.733. The same marker gave an 
observed heterozygosity of 0.472 in the study by Pakstis 
et al over the 40 populations examined (8). An additional 
review of the mass spectral data did not reveal an obvious 
error with the genotyping assay. The high heterozygos- 
ity was not observed in the African American or Hispanic 
sample groups, suggesting that testing additional Cauca- 
sian samples and/or an alternate typing method would be 
needed to confirm this result. 
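
To show why an observed heterozygosity of 0.733 is so unusual, the following sketch implements a standard complete-enumeration exact test for Hardy-Weinberg equilibrium at a biallelic locus. It is not the Arlequin implementation used in the study, and the genotype counts (9, 55, 11) are an assumption: they are not reported in the text but are consistent with the rounded Table 2 values for rs1019029 in the Caucasian group (Hob = 0.733, fA = 0.487, N = 75).

```python
from fractions import Fraction
from math import comb, factorial


def hwe_exact_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Exact HWE test: sum the probabilities of all heterozygote counts that are
    no more likely than the observed one, conditional on the allele counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab           # copies of allele A
    n_b = 2 * n - n_a               # copies of allele B
    total = comb(2 * n, n_a)        # normalizing constant

    def prob(het: int) -> Fraction:
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        ways = factorial(n) // (factorial(hom_a) * factorial(het) * factorial(hom_b))
        return Fraction(ways * 2**het, total)

    p_observed = prob(n_ab)
    # Heterozygote counts must have the same parity as the allele A count.
    candidates = range(n_a % 2, min(n_a, n_b) + 1, 2)
    return float(sum(p for p in map(prob, candidates) if p <= p_observed))


p_excess = hwe_exact_p(9, 55, 11)      # assumed counts with a heterozygote excess
p_balanced = hwe_exact_p(18, 37, 20)   # same allele counts, most likely genotype split
print(p_excess < 0.001, p_balanced)
```

With the same allele counts, 37 heterozygotes would be the most probable outcome and give a P value of 1, whereas 55 heterozygotes gives a P value below 0.001, in line with the low value reported for this marker.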
TABLE 2. Allele frequencies observed for the 3 US sample groups listed by single nucleotide polymorphism (SNP) locus. Each SNP
is identified with the corresponding dbSNP rs number. The format for each allele is listed in the line above the frequency. Example:
rs10092491 [C/T], where A = C and B = T. The P values less than 5% for the markers rs1019029, rs1358856, rs6811238, rs1523537,
rs447818, and rs13182883 are bolded.



Row order within each SNP block: He, Hob, fA, fB, P-value.



AfrAmer Cauc Hisp 

rs10092491 [C/T]

0.503 0.494 0.485 

0.575 0.467 0.489 

0.479 0.567 0.600 

0.521 0.433 0.400 

0.1546 0.6288 1.0000 






rs12997453 [A/G]



0.441 
0.486 
0.324 
0.676 
0.4391 



0.568 
0.514 
0.486 
0.3532 



0.554 
0.358 
0.642 
0.0804 



0.548 
0.479 
0.521 
0.3494 



0.541 
0.541 
0.459 
0.6343 



0.494 0.463 
0.600 0.444 
0.433 0.356 
0.567 0.644 
0.1040 1.0000 
rs1358856 [A/C]
0.503 0.503 0.499 
0.640 0.533 
0.507 0.444 
0.493 0.556 
0.0222 0.7660 
rs1821380 [C/G] 
0.463 0.483 0.502 
0.480 0.511 
0.400 0.456 
0.600 0.544 
1.0000 1.0000 
rs2567608 [A/G] 
0.503 0.503 0.497 
0.560 0.422 
0.520 0.567 
0.480 0.433 
0.3567 0.3770 
rs3780962 [C/T] 
0.500 0.498 0.505 
0.493 0.444 
0.553 0.511 
0.447 0.489 
1.0000 0.5570 
rs6591147 [C/T] 
0.503 0.464 0.490 
0.480 0.422 
0.640 0.589 
0.360 0.411 
0.7994 0.3610 
rs7520386 [A/G] 
0.503 0.500 0.493 
0.573 
0.540 
0.460 
0.1666 



AfrAmer Cauc Hisp 

rs1019029 [C/T] 
0.500 0.503 0.485 
0.479 0.733 0.578 
0.459 0.487 0.400 
0.541 0.513 0.600 
0.6425 0.0001 0.2150 

rs13134862 [A/G]
0.492 0.464 0.475 
0.419 0.560 0.400 
0.426 0.360 0.378 
0.574 0.640 0.622
0.1492 0.0840 0.3470 

rs1410059 [C/T] 
0.480 0.502 0.497 
0.405 0.467 0.467 
0.608 0.473 0.567 
0.392 0.527 0.433 
0.2234 0.4888 0.7510 

rs2073383 [C/T] 
0.498 0.490 0.502 
0.595 0.467 0.556 
0.554 0.580 0.544 
0.446 0.420 0.456 
0.1058 0.8075 0.5460 

rs279844 [A/T] 
0.501 0.498 0.506 
0.554 0.493 0.600 
0.466 0.553 0.500 
0.534 0.447 0.500 
0.4816 1.0000 0.2420 



AfrAmer Cauc Hisp 

rs10488710 [C/G]



AfrAmer Cauc Hisp 

rs1058083 [A/G]



rs445251 [C/G] 



0.477 
0.473 
0.385 
0.615 
0.8014 



0.490 
0.452 
0.418 
0.582 
0.4720 



0.485 
0.489 
0.400 
0.600 
1.0000 



0.496 
0.554 
0.561 
0.439 
0.2272 



0.477 
0.427 
0.613 
0.387 
0.4631 



0.490 
0.467 
0.589 
0.411 
0.7510 



0.432 
0.486 
0.514 
0.1515 



0.486 
0.541 
0.459 
0.8134 



0.608 
0.561 
0.439 
0.0614 



0.466 
0.562 
0.438 
0.6339 



0.501 
0.527 
0.466 
0.534 
0.6435 



0.474 
0.413 
0.380 
0.620 
0.3191 



0.468 
0.545 
0.364 
0.636 
0.3400 



AfrAmer Cauc Hisp 

rs1109037 [A/G]

0.503 0.503 0.506 

0.507 0.547 0.378 

0.514 0.513 0.500 

0.486 0.487 0.500 

1.0000 0.4956 0.1320 



rs6811238 [G/T]



0.527 
0.520 
0.480 
0.8144 



0.466 
0.507 
0.493 
0.6534 



0.500 
0.479 
0.459 
0.541 
0.6553 



0.485 0.494 

0.486 0.360 

0.595 0.567 

0.405 0.433 

0.8135 0.0211 

rs7704770 [A/G] 

0.480 0.490 0.499 

0.486 0.440 

0.608 0.580 

0.392 0.420 

1.0000 0.4948 

Abbreviations: He - expected heterozygosity; Hob - observed heterozygosity; fA - frequency of allele A; fB - frequency of allele B; P-value - Fisher
exact test for Hardy-Weinberg equilibrium.



0.578 
0.578 
0.422 
0.3550 



0.505 

0.444 

0.511 

0.489 

0.5700 



0.477 
0.557 
0.443 
1.0000 



rs13182883 [A/G]
0.503 0.488 0.502 
0.427 0.318 
0.413 0.455 
0.587 0.545 
0.3523 0.0230 
rs1478829 [A/T]
0.500 0.486 0.481 
0.467 0.467 
0.593 0.611 
0.407 0.389 
0.8144 1.0000 
rs214955 [A/G] 
0.496 0.503 0.493 
0.520 0.568 
0.487 0.420 
0.513 0.580 
0.8210 0.3590 
rs315791 [A/C] 
0.496 0.503 0.481 
0.520 0.556 
0.513 0.611 
0.487 0.389 
0.8214 0.3430 
rs447818 [A/G]
0.496 0.490 0.497 
0.658 0.440 0.511 
0.438 0.420 0.433 
0.562 0.580 0.567 
0.0047 0.4735 1.0000 

rs7205345 [C/G] 
0.500 0.500 0.485 
0.514 0.520 0.622 
0.541 0.540 0.600 
0.459 0.460 0.400 
1.0000 0.6288 0.0700 

rs985492 [C/T] 
0.503 0.496 0.497 
0.480 0.511 
0.440 0.433 
0.560 0.567 
0.8242 1.0000 



rs13218440 [A/G]



0.462 
0.438 
0.356 
0.644 
0.6126 



0.417 0.497 

0.373 0.556 

0.293 0.433 

0.707 0.567 

0.4118 0.5650 



rs1523537 [C/T]



0.477 
0.365 
0.385 
0.615 
0.0261 



0.468 0.493 

0.467 0.400 

0.367 0.422 

0.633 0.578 

0.7980 0.2000 



rs2272998 [C/G] 



0.492 
0.446 
0.426 
0.574 
0.3412 



0.468 0.490 

0.493 0.467 

0.367 0.411 

0.633 0.589 

0.6295 0.7660 



rs321198 [C/T] 



0.488 
0.500 
0.588 
0.412 
0.8141 



0.496 0.505 

0.466 0.378 

0.562 0.522 

0.438 0.478 

0.6342 0.1700 



rs560681 [A/G] 



0.415 
0.419 
0.709 
0.291 
1.0000 



0.433 0.425 

0.387 0.422 

0.687 0.700 

0.313 0.300 

0.4299 1.0000 



rs7229946 [A/G] 



rs987640 [A/T] 



0.541 
0.514 
0.486 
0.5155 



0.503 
0.473 
0.480 
0.520 
0.6496 



0.488 0.449 
0.507 0.578 
0.413 0.333 
0.587 0.667 
0.8243 0.1150 



rs1336071 [A/G]



0.458 
0.397 
0.349 
0.651 
0.2884 



0.459 
0.527 
0.473 
0.4775 



0.562 
0.610 
0.390 
0.1675 



0.507 
0.459 
0.541 
0.8112 



0.405 
0.568 
0.432 
0.1564 



0.494 0.475 
0.440 0.356 
0.433 0.378 
0.567 0.622 
0.3453 0.1140 
rs1554472 [C/T]
0.502 0.502 0.503 
0.547 0.533 
0.473 0.467 
0.527 0.533 
0.3641 0.7560 
rs2503107 [A/C]
0.479 0.490 0.493 
0.573 0.489 
0.580 0.578 
0.420 0.422 
0.0947 1.0000 
rs338882 [C/T] 
0.500 0.503 0.502 
0.560 0.422 
0.507 0.544 
0.493 0.456 
0.2645 0.3640 
rs6444724 [C/T] 
0.494 0.496 0.505 
0.533 0.556 
0.440 0.522 
0.560 0.478 
0.6400 0.5500 



rs740598 [A/G] 



0.499 0.497 

0.480 0.511 

0.453 0.433 

0.547 0.567 

0.8186 1.0000 



0.487 0.452 0.481 
0.466 0.493 0.511 
0.589 0.660 0.611 
0.411 0.340 0.389 
0.6048 0.3051 0.7620 

rs9951171 [A/G] 
0.490 0.503 0.502 
0.459 0.560 0.600 
0.419 0.493 0.544 
0.581 0.507 0.456 
0.6274 0.2435 0.2270 
TABLE 3. Summary of combined random match probabilities
(RMP) calculated for the three sample groups. The median,
minimum, and maximum observed RMPs for each sample group
are listed.

Combined RMP   African American   Caucasian   Hispanic
Median         3.37E-18           2.53E-18    2.29E-18
Minimum        2.15E-20           4.09E-21    2.86E-21
Maximum        2.39E-16           4.76E-16    7.35E-17

The median combined RMP across the 3 US sample
groups was approximately 2.73 × 10^-18 with a minimum of
2.39 × 10^-16 (African American sample) and maximum of
2.86 × 10^-21 (Hispanic sample). Pakstis et al reported an
average RMP across 40 populations of 10^-16 with a range of
2.02 × 10^-17 to 1.29 × 10^-13 (8).
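
The summary value quoted above appears to correspond to the arithmetic mean of the three group medians in Table 3. The short sketch below checks that reading and derives the order-of-magnitude range of the combined RMPs; treating the summary as a mean of medians is an interpretation, not something stated in the text.

```python
from math import floor, log10
from statistics import mean

# Values as listed in Table 3.
table3 = {
    "African American": {"median": 3.37e-18, "minimum": 2.15e-20, "maximum": 2.39e-16},
    "Caucasian": {"median": 2.53e-18, "minimum": 4.09e-21, "maximum": 4.76e-16},
    "Hispanic": {"median": 2.29e-18, "minimum": 2.86e-21, "maximum": 7.35e-17},
}

mean_of_medians = mean(row["median"] for row in table3.values())

smallest = min(row["minimum"] for row in table3.values())
largest = max(row["maximum"] for row in table3.values())
exponent_range = (floor(log10(smallest)), floor(log10(largest)))

print(f"{mean_of_medians:.2e}")   # 2.73e-18
print(exponent_range)             # (-21, -16)
```

The exponent range of -21 to -16 matches the range of combined random match probabilities given in the abstract.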

This is the first demonstration of a multiplex assay for typ- 
ing this specific panel of 40 autosomal SNPs. We found 
the PLEX-ID instrument and SNP-40 assay to be a robust 
and automated method to type SNP markers. The time re- 
quired to genotype 40 SNPs for 10 samples from start to 
finish (PCR to amplicon detection) was approximately 4.5 
hours. The average time required to review a plate (10
samples plus positive and negative controls) in the IbisTrack
software was approximately 15 minutes. The allele fre- 
quencies calculated for the US sample groups were found 
to be in agreement with published values, with the
possible exception of rs1019029. The allele frequencies are the
first derived from the NIST US population sample set for
this panel of 40 SNP markers intended as a universal panel 
for individual identification (8). 

Access to the data Genotyping results are available at:
http://www.cstl.nist.gov/biotech/strbase/NISTpop.htm.